Abstract

The $k$-forest problem is a common generalization of both the $k$-MST and the dense-$k$-subgraph problems.
Formally, given a metric space on $n$ vertices $V$, with $m$ demand pairs $\subseteq V \times V$ and a "target" $k \le m$,
the goal is to find a minimum cost subgraph that connects at least $k$ demand pairs. In this paper, we
give an $O(\min\{\sqrt{n}, \sqrt{k}\})$-approximation algorithm for $k$-forest, improving on the previous best ratio
of $O(n^{2/3} \log n)$ by Segev & Segev [SS06].

We then apply our algorithm for $k$-forest to obtain approximation algorithms for several Dial-a-Ride
problems. The basic Dial-a-Ride problem is the following: given an $n$ point metric space with $m$
objects, each with its own source and destination, and a vehicle capable of carrying at most $k$ objects
at any time, find the minimum length tour that uses this vehicle to move each object from its
source to its destination. We prove that an $\alpha$-approximation algorithm for the $k$-forest problem implies
an $O(\alpha \cdot \log^2 n)$-approximation algorithm for Dial-a-Ride. Using our results for $k$-forest, we get an
$O(\min\{\sqrt{n}, \sqrt{k}\} \cdot \log^2 n)$-approximation algorithm for Dial-a-Ride. The only previous result known
for Dial-a-Ride was an $O(\sqrt{k} \log n)$-approximation by Charikar & Raghavachari [CR98]; our results
give a different proof of a similar approximation guarantee. In fact, when the vehicle capacity $k$ is
large, we give a slight improvement on their results.

The reduction from Dial-a-Ride to the $k$-forest problem is fairly robust, and allows us to obtain
approximation algorithms (with the same guarantee) for the following generalizations: (i) Non-uniform
Dial-a-Ride, where the cost of traversing each edge is an arbitrary non-decreasing function of the
number of objects in the vehicle; and (ii) Weighted Dial-a-Ride, where demands are allowed to have
different weights. The reduction is essential, as it is unclear how to extend the techniques of Charikar &
Raghavachari to these Dial-a-Ride generalizations.

1 Introduction 

In the Steiner forest problem, we are given a set of vertex-pairs, and the goal is to find a forest such that
each vertex pair lies in the same tree in the forest. This is a generalization of the Steiner tree problem,
where all the pairs contain a common vertex called the root; both the tree and forest versions are
well-understood fundamental problems in network design [AKR91, GW92]. An important extension of the
Steiner tree problem studied in the late 1990s was the $k$-MST problem, where one sought the least-cost
tree that connected any $k$ of the terminals. Several approximation algorithms were given for the problem,
culminating in the 2-approximation of Garg [Gar05]; the $k$-MST problem proved crucial in many subsequent
developments in network design and vehicle routing [CGRT03, FHR03, BCK+03, BBCM04]. One can
analogously define the $k$-forest problem, where one needs to connect only $k$ of the pairs in some Steiner
forest instance. Surprisingly, very little is known about this problem, which was first studied formally as
recently as last year [HJ06, SS06]. In this paper, we give a simpler and improved approximation algorithm
for the $k$-forest problem.

Moreover, just like the $k$-MST variant, the $k$-forest problem seems to be useful in applications to network
design and vehicle routing. In the second half of the paper, we show a (somewhat surprising) reduction of
a well-studied vehicle routing problem called the Dial-a-Ride problem to the $k$-forest problem. In the
Dial-a-Ride problem, we are given a metric space with people having sources and destinations, and a bus of
some capacity $k$; the goal is to find a route for this bus so that each person can be taken from her source
to destination without exceeding the capacity of the bus at any point, such that the length of the bus route
is minimized. We show how the results for the $k$-forest problem slightly improve upon existing results for
the Dial-a-Ride problem; in fact, they give the first approximation algorithms for some generalizations of
Dial-a-Ride which do not seem amenable to previous techniques.

1.1 The $k$-Forest Problem

Our starting point is the $k$-forest problem, which generalizes both the $k$-MST and the dense-$k$-subgraph
problems.

Definition 1 (The $k$-Forest Problem) Given an $n$-vertex metric space $(V, d)$, and demands $\{s_i, t_i\}_{i=1}^m \subseteq V \times V$,
find the least-cost subgraph that connects at least $k$ demand-pairs.

Note that the $k$-forest problem is a generalization of the (minimization version of the) well-studied
dense-$k$-subgraph problem, for which nothing better than an $O(n^{1/3-\delta})$ approximation is known. The $k$-forest
problem was first defined in [HJ06], and the first non-trivial approximation was given by Segev and Segev [SS06],
who gave an algorithm with an approximation guarantee of $O(n^{2/3} \log n)$ for the case when $k = O(\mathrm{poly}(n))$.
We improve the approximation guarantee of the $k$-forest problem to $O(\min\{\sqrt{n}, \sqrt{k}\})$; formally, we prove
the following theorem in Section 2.

Theorem 2 (Approximating $k$-forest) There is an $O(\min\{\sqrt{n} \cdot \frac{\log k}{\log n}, \sqrt{k}\})$-approximation algorithm for
the $k$-forest problem. For the case when $k$ is less than a polynomial in $n$, the approximation guarantee
improves to $O(\min\{\sqrt{n}, \sqrt{k}\})$.

Apart from giving an improved approximation guarantee, our algorithm for the $k$-forest problem is
arguably simpler and more direct than that of [SS06] (which is based on Lagrangian relaxations for the
problem, and combining solutions to this relaxation). Indeed, we give two algorithms, both reducing the
$k$-forest problem to the $k$-MST problem in different ways and achieving different approximation guarantees;
we then return the better of the two answers. The first algorithm (giving an approximation of $O(\sqrt{k})$) uses
the $k$-MST algorithm to find good solutions on the sources and the sinks independently, and then uses the
Erdős-Szekeres theorem on monotone subsequences to find a "good" subset of these sources and sinks to
connect cheaply; details are given in Section 2.1. The second algorithm starts off with a single vertex as
the initial solution, and uses the $k$-MST algorithm to repeatedly find a low-cost tree that satisfies a large
number of demands which have one endpoint in the current solution and the other endpoint outside; this tree
is then used to greedily augment the current solution and proceed. Choosing the parameters (as described in
Section 2.2) gives us an $O(\sqrt{n})$ approximation.

1.2 The Dial-a-Ride Problem

In this paper, we use the $k$-forest problem to give approximation algorithms for the following vehicle routing
problem.

Definition 3 (The Dial-a-Ride Problem) Given an $n$-vertex metric space $(V, d)$, a starting vertex (or root) $r$,
a set of $m$ demands $\{(s_i, t_i)\}_{i=1}^m$, and a vehicle of capacity $k$, find a minimum length tour of the vehicle
starting (and ending) at $r$ that moves each object $i$ from its source $s_i$ to its destination $t_i$ such that the vehicle
carries at most $k$ objects at any point on the tour.

We say that an object is preempted if, after being picked up from its source, it can be left at some intermediate
vertices before being delivered to its destination. In this paper, we will not allow this, and will mainly be
concerned with the non-preemptive Dial-a-Ride problem.¹

The approximability of the Dial-a-Ride problem is not very well understood: the previous best upper
bound is an $O(\sqrt{k} \log n)$-approximation algorithm due to Charikar and Raghavachari [CR98], whereas
the best lower bound that we are aware of is APX-hardness (from TSP, say). We establish the following
(somewhat surprising) connection between the Dial-a-Ride and $k$-forest problems in Section 3.

Theorem 4 (Reducing Dial-a-Ride to $k$-forest) Given an $\alpha$-approximation algorithm for $k$-forest, there is
an $O(\alpha \cdot \log^2 n)$-approximation algorithm for the Dial-a-Ride problem.

In particular, combining Theorems 2 and 4 gives us an $O(\min\{\sqrt{k}, \sqrt{n}\} \cdot \log^2 n)$-approximation guarantee
for Dial-a-Ride. Of course, improving the approximation guarantee for $k$-forest would improve the result
for Dial-a-Ride as well.

Note that our results match the results of [CR98] up to a logarithmic term, and even give a slight
improvement when the vehicle capacity $k \gg n$, the number of nodes. Much more interestingly, our algorithm
for Dial-a-Ride easily extends to generalizations of the Dial-a-Ride problem. In particular, we consider a
substantially more general vehicle routing problem where the vehicle has no a priori capacity, and instead
the cost of traversing each edge $e$ is an arbitrary non-decreasing function $c_e(l)$ of the number of objects $l$
in the vehicle; setting $c_e(l)$ to the edge-length $d_e$ when $l \le k$, and $c_e(l) = \infty$ for $l > k$, gives us back the
classical Dial-a-Ride setting. In Section 3.2 we show that this general non-uniform Dial-a-Ride problem
admits an approximation guarantee that matches the best known for the classical Dial-a-Ride problem.
Another extension we consider is the weighted Dial-a-Ride problem. In this, each object may have a different
size, and the total size of the items in the vehicle must be bounded by the vehicle capacity; this has been earlier
studied as the pickup and delivery problem [SS95]. We show in Section 3.3 that this problem can be reduced
to the (unweighted) Dial-a-Ride problem at the loss of only a constant factor in the approximation guarantee.

As an aside, we consider the effect of preemptions in the Dial-a-Ride problem (Section 4). It was shown
in Charikar & Raghavachari [CR98] that the gap between the optimal preemptive and non-preemptive tours
could be as large as $\Omega(n^{1/3})$. We show that the real difference arises between zero and one preemptions:
allowing multiple preemptions does not give us much added power. In particular, we show
that for any instance of the Dial-a-Ride problem, there is a tour that preempts each object at most once and
has length at most $O(\log^2 n)$ times an optimal preemptive tour (which may preempt each object an arbitrary
number of times). Motivated by obtaining a better guarantee for Dial-a-Ride on the Euclidean plane, we
also study the preemption gap in such instances. We show that even in this case, there are instances having
a gap of $\Omega(n^{1/8})$ between optimal preemptive and non-preemptive tours.

¹ A note on the parameters: a feasible non-preemptive tour can be short-cut over vertices that do not participate in any demand,
so we can assume that every vertex is an end point of some demand, and $n \le 2m$. We may also assume, by preprocessing some
demands, that $m \le n^2 \cdot k$. However, in general, the number of demands $m$ and the vehicle capacity $k$ may be much larger than the
number of vertices $n$.

1.3 Related Work 

The $k$-forest problem: The $k$-forest problem is relatively new: it was defined by Hajiaghayi & Jain [HJ06].
An $O(k^{2/3})$-approximation algorithm for even the directed $k$-forest problem can be inferred from [CCC+98].
Recently, Segev & Segev [SS06] gave an $O(n^{2/3} \log n)$ approximation algorithm for $k$-forest.

Dense $k$-subgraph: The $k$-forest problem is a generalization of the dense-$k$-subgraph problem [FPK01], as
shown in [HJ06]. The best known approximation guarantee for the dense-$k$-subgraph problem is $O(n^{1/3-\delta})$
where $\delta > 0$ is some constant, due to Feige et al. [FPK01], and obtaining an improved guarantee has been a
long standing open problem. Strictly speaking, Feige et al. [FPK01] study a potentially harder problem: the
maximization version of dense-$k$-subgraph, where one wants to pick $k$ vertices to maximize the number of
edges in the induced graph. However, nothing better is known even for the minimization version of
dense-$k$-subgraph (where one wants to pick the minimum number of vertices that induce $k$ edges), which is a special
case of $k$-forest. The $k$-forest problem is also a generalization of $k$-MST, for which a 2-approximation is
known (Garg [Gar05]).

Dial-a-Ride: While the Dial-a-Ride problem has been studied extensively in the operations research
literature, relatively little is known about its approximability. The currently best known approximation ratio for
Dial-a-Ride is $O(\sqrt{k} \log n)$ due to Charikar & Raghavachari [CR98]. We note that their algorithm assumes
instances with unweighted demands. Krumke et al. [KRW00] give a 3-approximation algorithm for the
Dial-a-Ride problem on a line metric; in fact, their algorithm finds a non-preemptive tour that has length at
most 3 times the preemptive lower bound. (Clearly, the cost of an optimal preemptive tour is at most that of
an optimal non-preemptive tour.) A 2.5-approximation algorithm for the single source version of Dial-a-Ride
(also called the "capacitated vehicle routing" problem) was given by Haimovich & Kan [HK85]: again,
their algorithm outputs a non-preemptive tour with length at most 2.5 times the preemptive lower bound.
For the preemptive Dial-a-Ride problem, Charikar & Raghavachari [CR98] gave the current-best $O(\log n)$
approximation algorithm, and Gørtz [Gør06] showed that it is hard to approximate this problem to better than
$\Omega(\log^{1/4-\epsilon} n)$. Recall that no super-constant hardness results are known for the non-preemptive Dial-a-Ride
problem.

2 The $k$-forest problem

In this section, we study the $k$-forest problem, and give an approximation guarantee of $O(\min\{\sqrt{n}, \sqrt{k}\})$.
This result improves upon the previous best $O(n^{2/3} \log n)$-approximation guarantee [SS06] for this problem.
The algorithm in Segev & Segev [SS06] is based on a Lagrangian relaxation for this problem, and suitably
combining solutions to this relaxation. In contrast, our algorithm uses a more direct approach and is much
simpler in description. Our approach is based on approximating the following "density" variant of $k$-forest.



Definition 5 (Minimum-ratio $k$-forest) Given an $n$-vertex metric space $(V, d)$, $m$ pairs of vertices $\{s_i, t_i\}_{i=1}^m$
and a target $k$, find a tree $T$ that connects at most $k$ pairs, and minimizes the ratio of the length of $T$ to the
number of pairs connected in $T$.²

We present two different algorithms for minimum-ratio $k$-forest, obtaining approximation guarantees of
$O(\sqrt{k})$ (Section 2.1) and $O(\sqrt{n})$ (Section 2.2); these are then combined to give the claimed result for the
$k$-forest problem. Both our algorithms are based on subtle reductions to the $k$-MST problem, albeit in very
different ways.

² Even if we relax the solution to be any forest, we may assume (by averaging) that the optimal ratio solution is a tree.

As is usual, when we say that our algorithm guesses a parameter in the following discussion, it means 
that the algorithm is run for each possible value of that parameter, and the best solution found over all the 
runs is returned. As long as only a constant number of parameters are being guessed and the number of 
possibilities for each of these parameters is polynomial, the algorithm is repeated only a polynomial number 
of times. 

2.1 An $O(\sqrt{k})$ approximation algorithm

In this section, we give an $O(\sqrt{k})$ approximation algorithm for minimum ratio $k$-forest, which is based on a
simple reduction to the $k$-MST problem. The basic intuition is to look at the solution $S$ to minimum-ratio
$k$-forest and consider an Euler tour of this tree $S$. A theorem of Erdős & Szekeres on increasing subsequences
implies that there must be at least $\sqrt{|S|}$ sources which are visited in the same order as the corresponding
sinks. We use this existence result to combine the source-sink pairs to create an instance of $\sqrt{|S|}$-MST from
which we can obtain a good-ratio tree; the details follow.

Let $S$ denote an optimal ratio tree that covers $q$ demands and has length $B$, and let $D$ denote the largest
distance between any demand pair that is covered in $S$ (note $D \le B$). We define a new metric $l$ on the set
$\{1, \dots, m\}$ of demands as follows. The distance between demands $i$ and $j$ is $l_{i,j} = d(s_i, s_j) + d(t_i, t_j)$, where
$(V, d)$ is the original metric. The $O(\sqrt{k})$ approximation algorithm first guesses the number of demands $q$
and the largest demand-pair distance $D$ in the optimal tree $S$ (there are at most $m$ choices for each of $q$ and
$D$). The algorithm discards all demand pairs $(s_i, t_i)$ such that $d(s_i, t_i) > D$ (all the pairs covered in the
optimal solution $S$ still remain). Then the algorithm runs the unrooted $k$-MST algorithm [Gar05] with target
$\lfloor\sqrt{q}\rfloor$, in the metric $l$, to obtain a tree $T$ on the demand pairs $P$. From $T$, we easily obtain trees $T_1$ (on all
sources in $P$) and $T_2$ (on all sinks in $P$) in metric $d$ such that $d(T_1) + d(T_2) = l(T)$. Finally the algorithm
outputs the tree $T' = T_1 \cup T_2 \cup \{e\}$, where $e$ is any edge joining a source in $T_1$ to its corresponding
sink in $T_2$. Due to the pruning of demand pairs that have large distance, $d(e) \le D$, and the length of $T'$ is
$d(T') \le l(T) + D \le l(T) + B$.
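The pruning step and the demand metric $l$ can be illustrated with a small sketch. This is only an illustration under stated assumptions: the names `demand_metric` and `line_distance` are hypothetical, the metric $d$ is passed in as a function, and the $k$-MST call itself is not shown.

```python
def line_distance(u, v):
    """Toy metric d for illustration: points on a line (an assumption, not from the paper)."""
    return abs(u - v)


def demand_metric(d, demands, D):
    """Discard demand pairs with d(s, t) > D, then build the metric l on the rest.

    Returns (kept, l) where kept is the list of surviving (s, t) pairs and
    l[i][j] = d(s_i, s_j) + d(t_i, t_j) for the i-th and j-th surviving pairs.
    """
    kept = [(s, t) for (s, t) in demands if d(s, t) <= D]
    l = [[d(si, sj) + d(ti, tj) for (sj, tj) in kept] for (si, ti) in kept]
    return kept, l


# Usage: the pair (0, 10) is farther apart than D = 2 and is pruned.
kept, l = demand_metric(line_distance, [(0, 1), (2, 3), (0, 10)], 2)
```

Since $l$ is a sum of two metrics evaluated on sources and sinks separately, a tree in $l$ splits into a tree on the sources and a tree on the sinks with the same total length, which is the property used to obtain $T_1$ and $T_2$.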

We now argue that the cost of the solution $T$ found by the $k$-MST algorithm satisfies $l(T) \le 8B$. Consider the
optimal ratio tree $S$ (in metric $d$) that has $q$ demands $\{(s_1, t_1), \dots, (s_q, t_q)\}$, and let $\tau$ denote an Euler tour
of $S$. Suppose that in a traversal of $\tau$, the sources of demands in $S$ are seen in the order $s_1, \dots, s_q$. Then in
the same traversal, the sinks of demands in $S$ will be seen in the order $t_{\pi(1)}, \dots, t_{\pi(q)}$, for some permutation
$\pi$. The following fact is well known (see, e.g., [Ste95]).

Theorem 6 (Erdős & Szekeres) Every permutation on $\{1, \dots, q\}$ has either an increasing subsequence of
length $\lfloor\sqrt{q}\rfloor$ or a decreasing subsequence of length $\lfloor\sqrt{q}\rfloor$.
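To make Theorem 6 concrete, the following sketch extracts a longest monotone subsequence of a permutation and checks the $\lfloor\sqrt{q}\rfloor$ bound exhaustively for small $q$. It uses the standard patience-sorting method for longest increasing subsequences, which is a choice made here for illustration and is not part of the paper; all function names are hypothetical.

```python
from bisect import bisect_left
from itertools import permutations
from math import isqrt


def longest_increasing(seq):
    """Return one longest strictly increasing subsequence of seq."""
    tails = []       # tails[k]: index in seq of the smallest tail of an increasing run of length k+1
    tail_vals = []   # values at those indices, kept sorted so bisect applies
    prev = [-1] * len(seq)
    for i, x in enumerate(seq):
        k = bisect_left(tail_vals, x)
        if k == len(tails):
            tails.append(i)
            tail_vals.append(x)
        else:
            tails[k] = i
            tail_vals[k] = x
        prev[i] = tails[k - 1] if k > 0 else -1
    out = []
    i = tails[-1] if tails else -1
    while i != -1:            # walk predecessor links back from the last tail
        out.append(seq[i])
        i = prev[i]
    return out[::-1]


def monotone_subsequence(perm):
    """Return the longer of a longest increasing and a longest decreasing subsequence."""
    inc = longest_increasing(perm)
    dec = [-x for x in longest_increasing([-x for x in perm])]
    return inc if len(inc) >= len(dec) else dec


def holds_for_all_permutations(q):
    """Exhaustively check the floor(sqrt(q)) bound for every permutation of 1..q (small q only)."""
    return all(len(monotone_subsequence(list(p))) >= isqrt(q)
               for p in permutations(range(1, q + 1)))
```

In the argument that follows, the permutation is $\pi$ (the order in which sinks appear along the Euler tour), and the indices of the extracted subsequence play the role of the demand set $M$.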

Using Theorem 6, we obtain a set $M$ of $p = \lfloor\sqrt{q}\rfloor$ demands such that (1) the sources in $M$ appear in increasing
order in a traversal of the Euler tour $\tau$, and (2) the sinks in $M$ appear in increasing order in a traversal of
either $\tau$ or $\tau^R$ (the reverse traversal of $\tau$). Let $j_0 < j_1 < \dots < j_{p-1}$ denote the demands in $M$ in increasing
order. From statement (1) above, $\sum_{i=0}^{p-1} d(s(j_i), s(j_{i+1})) \le d(\tau)$, where the indices in the summation are
modulo $p$. Similarly, statement (2) implies that $\sum_{i=0}^{p-1} d(t(j_i), t(j_{i+1})) \le \max\{d(\tau), d(\tau^R)\} = d(\tau)$. Thus
we obtain:

$$\sum_{i=0}^{p-1} \left[ d(s(j_i), s(j_{i+1})) + d(t(j_i), t(j_{i+1})) \right] \le 2 d(\tau) \le 4B$$

But this sum is precisely the length of the tour $j_0, j_1, \dots, j_{p-1}, j_0$ in metric $l$. In other words, there is a tree
of length $4B$ in metric $l$ that contains $\lfloor\sqrt{q}\rfloor$ vertices. So the cost of the solution $T$ found by the $k$-MST
approximation algorithm is at most $8B$.

Now the final solution $T'$ has length at most $l(T) + B \le 9B$, and ratio that is at most $9\sqrt{q} \cdot \frac{B}{q} \le 9\sqrt{k} \cdot \frac{B}{q}$
(here $\sqrt{q}$ is treated as an integer; rounding to $\lfloor\sqrt{q}\rfloor$ changes only the constant).

Thus we have an $O(\sqrt{k})$ approximation algorithm for minimum ratio $k$-forest.

2.2 An $O(\sqrt{n})$ approximation algorithm

In this section, we show an $O(\sqrt{n})$ approximation algorithm for the minimum ratio $k$-forest problem. The
approach is again to reduce to the $k$-MST problem, but the intuition is rather different: either we find a vertex
$v$ such that a large number of demand-pairs of the form $(v, *)$ can be satisfied using a small tree (the
"high-degree" case); or, if no such vertex exists, we show that a repeated greedy procedure would cover most vertices
without paying too much (and since we are in the "low-degree" case, covering most vertices implies covering
most demands too). The details follow.

Let $S$ denote an optimal solution to minimum ratio $k$-forest, and $q \le k$ the number of demand pairs
covered in $S$. We define the degree $\Delta$ of $S$ to be the maximum number of demands (among those covered in
$S$) that are incident at any vertex in $S$. The algorithm first guesses the following parameters of the optimal
solution $S$: its length $B$ (within a factor 2), the number of pairs covered $q$, the degree $\Delta$, and the vertex
$w \in S$ that has $\Delta$ demands incident at it. Although there may be an exponential number of choices for
the optimal length, a polynomial number of guesses within a binary search suffice to get a $B$ such that
$B \le d(S) \le 2 \cdot B$. The algorithm then returns the better of the two procedures described below.

Procedure 1 (high-degree case): Since the degree of vertex $w$ in the optimal solution $S$ is $\Delta$, there is a tree
rooted at $w$ of length $d(S) \le 2B$ that contains at least $\Delta$ demands having one end point at $w$. We assign
a weight to each vertex $u$, equal to the number of demands that have one end point at this vertex $u$ and the
other end point at $w$. Then we run the $k$-MST algorithm [Gar05] with root $w$ and a target weight of $\Delta$. By
the preceding argument, this problem has a feasible solution of length $2B$; so we obtain a solution $H$ of
length at most $4B$ (since the algorithm of [Gar05] is a 2-approximation). The ratio of solution $H$ is thus at
most $4B/\Delta = \frac{4q}{\Delta} \cdot \frac{B}{q}$.
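The vertex weights used in Procedure 1 are simple to compute. The sketch below is a minimal illustration, assuming demands are given as a list of (source, sink) pairs; the function name `vertex_weights` is hypothetical, and the rooted weighted $k$-MST call that consumes these weights is not shown.

```python
from collections import Counter


def vertex_weights(demands, w):
    """Weight of each vertex u = number of demands with one end point at u and the other at w."""
    weight = Counter()
    for s, t in demands:
        if s == w and t != w:
            weight[t] += 1
        elif t == w and s != w:
            weight[s] += 1
    return dict(weight)   # vertices with no demand to w are omitted (weight 0)


# Usage: with guessed vertex w = 0, only demands touching 0 contribute.
weights = vertex_weights([(0, 1), (1, 0), (0, 2), (3, 4)], 0)
```

A tree rooted at $w$ that collects total weight $\Delta$ then covers at least $\Delta$ demands, which is what the ratio bound for $H$ uses.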

Procedure 2 (low-degree case): Set $t = \frac{q}{2\Delta}$; note that $q \le \frac{n\Delta}{2}$ and so $t \le n/4$. We maintain a current tree
$T$ (initially just vertex $w$), which is updated in iterations as follows: shrink $T$ to a supernode $s$, and run the
$k$-MST algorithm with root $s$ and a target of $t$ new vertices. If the resulting $s$-tree has length at most $4B$,
include this tree in the current tree $T$ and continue. If the resulting $s$-tree has length more than $4B$, or if all
the vertices have been included, the procedure ends. Since $t$ new vertices are added in each iteration, the
number of iterations is at most $\frac{n}{t}$; so the length of $T$ is at most $\frac{4n}{t} B$. We now show that $T$ contains at least
$\frac{q}{2}$ demands. Consider the set $S \setminus T$ (recall, $S$ is the optimal solution). It is clear that $|S \setminus T| < t$; otherwise
the $k$-MST instance in the last iteration (with the current $T$) would have $S$ as a feasible solution of length
$\le 2B$ (and hence would find one of length at most $4B$). So the number of demands covered in $S$ that have
at least one end point in $S \setminus T$ is at most $|S \setminus T| \cdot \Delta \le t \cdot \Delta = q/2$ (as $\Delta$ is the degree of solution $S$). Thus
there are at least $q/2$ demands contained in $S \cap T$, in particular in $T$. Thus $T$ is a solution having ratio at
most $\frac{4n}{t} B \cdot \frac{2}{q} = \frac{8n}{t} \cdot \frac{B}{q}$.

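The better ratio solution among $H$ and $T$ from the two procedures has ratio at most $\min\{\frac{4q}{\Delta}, \frac{8n}{t}\} \cdot \frac{B}{q} =
\min\{8t, \frac{8n}{t}\} \cdot \frac{B}{q} \le 8\sqrt{n} \cdot \frac{B}{q} \le 8\sqrt{n} \cdot \frac{d(S)}{q}$. So this algorithm is an $O(\sqrt{n})$ approximation to the minimum
ratio $k$-forest problem.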
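The step bounding the minimum of the two ratios by $8\sqrt{n}$ follows from a short case analysis, spelled out here for convenience. It assumes $t = q/(2\Delta)$ as set in Procedure 2, so that the Procedure 1 ratio $4q/\Delta$ equals $8t$.

```latex
\[
\min\Bigl\{\,8t,\ \frac{8n}{t}\Bigr\} \;\le\; 8\sqrt{n}
\qquad\text{since}\qquad
\begin{cases}
8t \le 8\sqrt{n} & \text{if } t \le \sqrt{n},\\[4pt]
\dfrac{8n}{t} < \dfrac{8n}{\sqrt{n}} = 8\sqrt{n} & \text{if } t > \sqrt{n}.
\end{cases}
\]
```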

2.3 Approximation algorithm for $k$-forest

Given the two algorithms for minimum ratio $k$-forest, we can use them in a standard greedy fashion (i.e.,
keep picking approximately minimum-ratio solutions until we obtain a forest connecting at least $k$ pairs);
the standard set cover analysis can be used to show an $O(\min\{\sqrt{n}, \sqrt{k}\} \cdot \log k)$-approximation guarantee
for $k$-forest. A tighter analysis of the greedy algorithm (as done, e.g., in Charikar et al. [CCC+98]) can be
used to remove the logarithmic terms and obtain the guarantee stated in Theorem 2.
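The greedy framework can be sketched as a loop around a minimum-ratio oracle. This is a schematic illustration only: `greedy_k_forest` and `shortest_pair_oracle` are hypothetical names, the oracle interface (uncovered demands and remaining target in, tree edges and newly covered demands out) is an assumption, and the toy oracle simply joins the closest uncovered pair on a line rather than running either algorithm of Section 2.

```python
def greedy_k_forest(demands, k, min_ratio_oracle):
    """Repeatedly add an (approximately) minimum-ratio tree until k demands are connected."""
    uncovered = set(demands)
    chosen, covered = [], 0
    while covered < k and uncovered:
        # The remaining target caps how many pairs the next tree may connect.
        tree_edges, newly = min_ratio_oracle(uncovered, k - covered)
        newly = set(newly) & uncovered
        if not newly:
            raise ValueError("oracle made no progress")
        chosen.append(tree_edges)
        uncovered -= newly
        covered += len(newly)
    return chosen, covered


def shortest_pair_oracle(uncovered, target):
    """Toy stand-in oracle on a line metric: connect the closest uncovered pair directly."""
    s, t = min(uncovered, key=lambda p: (abs(p[0] - p[1]), p))
    return [(s, t)], [(s, t)]


# Usage: connect k = 2 of three demands given as points on a line.
forest, covered = greedy_k_forest([(0, 5), (1, 2), (3, 7)], 2, shortest_pair_oracle)
```

Passing the remaining target to the oracle mirrors Definition 5, where the tree connects at most the target number of pairs, so the greedy does not overshoot $k$ by much in any one step.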



3 Applications to Dial-a-Ride problems 

In this section, we study applications of the $k$-forest problem to the Dial-a-Ride problem (Definition 3), and
some generalizations. A natural solution-structure for Dial-a-Ride involves servicing demands in batches
of at most $k$ each, where a batch consisting of a set $S$ of demands is served as follows: the vehicle starts
out being empty, picks up each of the $|S| \le k$ objects from their sources, then drops off each object at its
destination, and is again empty at the end. If we knew that the optimal solution has this structure, we could
obtain a greedy framework for Dial-a-Ride by repeatedly finding the best 'batch' of $k$ demands. However,
the optimal solution may involve carrying almost $k$ objects at every point in the tour, in which case it
cannot be decomposed to be of the above structure. In Theorem 7, we show that there is always a near optimal
solution having this 'pick-drop in batches' structure. Building on Theorem 7, we obtain approximation
algorithms for the classical Dial-a-Ride problem (Section 3.1), and two interesting extensions: non-uniform
Dial-a-Ride (Section 3.2) and weighted Dial-a-Ride (Section 3.3).

Theorem 7 (Structure Theorem) Given any instance of Dial-a-Ride, there exists a feasible tour $\tau$ satisfying
the following conditions:

1. $\tau$ can be split into a set of segments $\{S_1, \dots, S_t\}$ (i.e., $\tau = S_1 \cdot S_2 \cdots S_t$) where each segment $S_i$
services a set $O_i$ of at most $k$ demands such that $S_i$ is a path that first picks up each demand in $O_i$
and then drops each of them.

2. The length of $\tau$ is at most $O(\log m)$ times the length of an optimal tour.

Proof: Consider an optimal non-preemptive tour $\sigma$: let $c(\sigma)$ denote its length, and $|\sigma|$ denote the number
of edge traversals in $\sigma$. Note that if in some visit to a vertex $v$ in $\sigma$ there is no pick-up or drop-off, then
the tour can be short-cut over vertex $v$, and it still remains feasible. Further, due to triangle inequality, the
length $c(\sigma)$ does not increase by this operation. So we may assume that each vertex visit in $\sigma$ involves a
pick-up or drop-off of some object. Since there is exactly one pick-up and one drop-off for each object, we have
$|\sigma| \le 2m + 1$. Define the stretch of a demand $i$ to be the number of edge traversals in $\sigma$ between the
pick-up and drop-off of object $i$. The demands are partitioned as follows: for each $j = 1, \dots, \lceil \log(2m) \rceil$,
group $G_j$ consists of all the demands whose stretch lies in the interval $[2^{j-1}, 2^j)$. We consider each group
$G_j$ separately.
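The partition into groups $G_j$ can be sketched directly from pick-up and drop-off positions along the tour. The representation below (dictionaries mapping each demand to the index of its pick-up and drop-off visit) and the name `group_by_stretch` are assumptions made for illustration.

```python
def group_by_stretch(pick, drop):
    """Group demands by stretch: demand i goes to G_j where 2**(j-1) <= stretch < 2**j.

    pick[i] and drop[i] are the positions (visit indices along the tour) at which
    object i is picked up and dropped off; stretch is the number of edge
    traversals in between.
    """
    groups = {}
    for i in pick:
        stretch = drop[i] - pick[i]
        if stretch < 1:
            raise ValueError("each object must be dropped after it is picked up")
        j = stretch.bit_length()   # the unique j with 2**(j-1) <= stretch < 2**j
        groups.setdefault(j, []).append(i)
    return groups


# Usage: stretches 1, 3 and 7 fall into groups 1, 2 and 3 respectively.
groups = group_by_stretch({"a": 0, "b": 1, "c": 2}, {"a": 1, "b": 4, "c": 9})
```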

Claim 8 For each $j = 1, \dots, \lceil \log(2m) \rceil$, there is a tour $\tau_j$ that serves all the demands in group $G_j$, satisfies
condition 1 of Theorem 7, and has length at most $6 \cdot c(\sigma)$.

Proof: Consider tour $\sigma$ as a line $\mathcal{L}$, with every edge traversal in $\sigma$ represented by a distinct edge in $\mathcal{L}$.
Number the vertices in $\mathcal{L}$ from $0$ to $h$, where $h = |\sigma|$ is the number of edge traversals in $\sigma$. Note that each
vertex in $V$ may be represented multiple times in $\mathcal{L}$. Each demand is associated with the numbers of the
vertices (in $\mathcal{L}$) where it is picked up and dropped off.

Let $r = 2^{j-1}$, and partition $G_j$ as follows: for $l = 1, \dots, \lceil h/r \rceil$, set $O_{l,j}$ consists of all demands in $G_j$
that are picked up at a vertex numbered between $(l-1)r$ and $lr - 1$. Since every demand in $G_j$ has stretch
in the interval $[r, 2r]$, every demand in $O_{l,j}$ is dropped off at a vertex numbered between $lr$ and $(l+2)r - 1$.
Note that $|O_{l,j}|$ equals the number of demands in $G_j$ carried over edge $(lr - 1, lr)$ by tour $\sigma$, which is at
most $k$. We define segment $S_{l,j}$ to start at vertex number $(l-1)r$ and traverse all edges in $\mathcal{L}$ until vertex
number $(l+2)r - 1$ (servicing all demands in $O_{l,j}$ by first picking up each demand between vertices $(l-1)r$
and $lr - 1$, then dropping off each demand between vertices $lr$ and $(l+2)r - 1$), and then return (with the
vehicle being empty) to vertex $lr$. Clearly, the number of objects carried over any edge in $S_{l,j}$ is at most the
number carried over the corresponding edge traversal in $\sigma$. Also, each edge in $\mathcal{L}$ participates in at most 3
segments $S_{l,j}$, and each edge is traversed at most twice in any segment. So the total length of all segments
$S_{l,j}$ is at most $6 \cdot c(\sigma)$. We define tour $\tau_j$ to be the concatenation $S_{1,j} \cdots S_{\lceil h/r \rceil, j}$. It is clear that this tour
satisfies condition 1 of Theorem 7. ∎
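The batching inside one group can be sketched in the same spirit. The code assumes the same hypothetical representation of pick-up positions as indices on the line; `batches_for_group` and `segment_span` are illustrative names, and clipping the segment to the end of the line at position $h$ is an assumption added here for the last batches.

```python
def batches_for_group(pick, group, j):
    """Split group G_j into batches O_{l,j} by pick-up position, with r = 2**(j-1)."""
    r = 2 ** (j - 1)
    batches = {}
    for i in group:
        l = pick[i] // r + 1   # pick-up position lies in [(l-1)r, lr-1]
        batches.setdefault(l, []).append(i)
    return r, batches


def segment_span(l, r, h):
    """Positions covered by segment S_{l,j}: from (l-1)r up to (l+2)r-1, clipped to [0, h]."""
    return (l - 1) * r, min((l + 2) * r - 1, h)


# Usage: group G_2 (stretch 2 or 3), so r = 2 and batches are 2 positions wide.
r, batches = batches_for_group({"a": 0, "b": 3, "c": 4}, ["a", "b", "c"], 2)
span = segment_span(2, r, 10)
```

Each position of the line lies in the span of at most three consecutive values of $l$, which is where the factor 3 in the length bound comes from.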

Applying this claim to each group $G_j$, and concatenating the resulting tours, we obtain the tour $\tau$
satisfying condition 1 and having length at most $6 \log(2m) \cdot c(\sigma) = O(\log m) \cdot c(\sigma)$. ∎

Remark: The ratio $O(\log m)$ in Theorem 7 is almost best possible. There are instances of Dial-a-Ride (even
on an unweighted line), where every solution satisfying condition 1 of Theorem 7 has length at least
$\Omega(\max\{\frac{\log m}{\log\log m}, \frac{k}{\log k}\})$ times the optimal non-preemptive tour. So, if we only use solutions of this structure,
then it is not possible to obtain an approximation factor (just in terms of capacity $k$) for Dial-a-Ride that
is better than $\Omega(k/\log k)$. The solutions found by the algorithm for Dial-a-Ride in [CR98] also satisfy
condition 1 of Theorem 7. It is interesting to note that when the underlying metric is a hierarchically
well-separated tree, [CR98] obtain a solution of such structure having length $O(\sqrt{k})$ times the optimum, whereas
there is a lower bound of $\Omega(\frac{k}{\log k})$ even for the simple case of an unweighted line.

3.1 Classical Dial-a-Ride 

Theorem 7 suggests a greedy strategy for Dial-a-Ride, based on repeatedly finding the best batch of $k$
demands to service. This greedy subproblem turns out to be the minimum ratio $k$-forest problem
(Definition 5), for which we already have an approximation algorithm. The next theorem sets up the reduction from
Dial-a-Ride to $k$-forest.

Theorem 9 (Reducing Dial-a-Ride to minimum ratio $k$-forest) A $\rho$-approximation algorithm for minimum
ratio $k$-forest implies an $O(\rho \log^2 m)$-approximation algorithm for Dial-a-Ride.

Proof: The algorithm for Dial-a-Ride is as follows.

1. $\mathcal{C} = \emptyset$.

2. Until there are no uncovered demands, do:

(a) Solve the minimum ratio $k$-forest problem, to obtain a tree $C$ covering $k_C \le k$ new demands.

(b) Set $\mathcal{C} \leftarrow \mathcal{C} \cup \{C\}$.

3. For each tree $C \in \mathcal{C}$, obtain an Euler tour on $C$ to locally service all demands (pick up all $k_C$ objects in the first
traversal, and drop them all in the second traversal). Then use a 1.5-approximate TSP tour on the sources to
connect all the local tours, and obtain a feasible non-preemptive tour.
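The local service of one tree in step 3 can be sketched as follows. This is an illustration under assumptions: the tree is given as an adjacency dictionary, every source and sink of the batch is a vertex of the tree, and the names `euler_tour` and `serve_batch` are hypothetical. The TSP tour that stitches the local tours together is not shown.

```python
def euler_tour(adj, root):
    """Closed walk from root that traverses every edge of the tree twice."""
    tour = [root]

    def visit(u, parent):
        for v in adj[u]:
            if v != parent:
                tour.append(v)
                visit(v, u)
                tour.append(u)

    visit(root, None)
    return tour


def serve_batch(adj, root, demands):
    """Route and events for one batch: pick up on the first pass, drop off on the second."""
    tour = euler_tour(adj, root)
    events, done = [], set()
    for action, endpoint in (("pick", 0), ("drop", 1)):
        done.clear()
        for v in tour:
            for i, pair in enumerate(demands):
                if pair[endpoint] == v and i not in done:
                    done.add(i)
                    events.append((action, i, v))
    route = tour + tour[1:]   # two consecutive traversals of the same closed walk
    return route, events


# Usage: a star with centre 0 and leaves 1, 2; demands 1 -> 2 and 2 -> 0.
route, events = serve_batch({0: [1, 2], 1: [0], 2: [0]}, 0, [(1, 2), (2, 0)])
```

Because every pick-up precedes every drop-off, the load never exceeds the number of demands in the batch, which is at most $k$.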

Consider the tour $\tau$ and its segments as in Theorem 7. If the number of uncovered demands in some
iteration is $m'$, one of the segments in $\tau$ is a solution to the minimum ratio $k$-forest problem of value at
most $\frac{d(\tau)}{m'}$. Since we have a $\rho$-approximation algorithm for this problem, we would find a segment of ratio
at most $O(\rho) \cdot \frac{d(\tau)}{m'}$. Now a standard set cover type argument shows that the total length of trees in $\mathcal{C}$ is
at most $O(\rho \log m) \cdot d(\tau) \le O(\rho \log^2 m) \cdot OPT$, where $OPT$ is the optimal value of the Dial-a-Ride
instance. Further, the TSP tour on all sources is a lower bound on $OPT$, and we use a 1.5-approximate
solution [Chr77]. So the final non-preemptive tour output in step 3 above has length at most $O(\rho \log^2 m) \cdot OPT$. ∎

This theorem is in fact stronger than Theorem 4 claimed earlier: it is easy to see that any approximation
algorithm for $k$-forest implies an algorithm with the same guarantee for minimum ratio $k$-forest. Note that
$m$ and $k$ may be super-polynomial in $n$. However, we show in Section 3.3 that with the loss of a constant
factor, the general Dial-a-Ride problem can be reduced to one where the number of demands $m \le n^4$.
Based on this and Theorem 9, a $\rho$-approximation algorithm for minimum ratio $k$-forest actually implies
an $O(\rho \log^2 n)$ approximation algorithm for Dial-a-Ride. Using the approximation algorithm for minimum
ratio $k$-forest (Section 2), we obtain an $O(\min\{\sqrt{n}, \sqrt{k}\} \cdot \log^2 n)$ approximation algorithm for the
Dial-a-Ride problem.

Remark: If we use the $O(\sqrt{k})$ approximation for $k$-forest, the resulting non-preemptive tour is in fact
feasible even for a $\sqrt{k}$ capacity vehicle! As noted in [CR98], this property is also true of their algorithm,
which is based on an entirely different approach.

3.2 Non-uniform Dial-a-Ride 

The greedy framework for Dial-a-Ride described above is actually more generally applicable than to just the
classical Dial-a-Ride problem. In this section, we consider the Dial-a-Ride problem under a substantially
more general class of cost functions, and show how the $k$-forest problem can be used to obtain an
approximation algorithm for this generalization as well. In fact, the approximation guarantee we obtain by this
approach matches the best known for the classical Dial-a-Ride problem. Our framework for Dial-a-Ride
is well suited for such a generalization since it is a 'primal' approach, based on directly approximating a
near-optimal solution; this approach is not too sensitive to the cost function. On the other hand, the Charikar
& Raghavachari [CR98] algorithm is a 'dual' approach, based on obtaining a good lower bound, which
depends heavily on the cost function. Thus it is unclear whether their techniques can be extended to handle
such a generalization.

Definition 10 (Non-uniform Dial-a-Ride) Given an n vertex undirected graph $G = (V, E)$, a root vertex
r, a set of m demands $\{(s_i, t_i)\}_{i=1}^{m}$, and a non-decreasing cost function $c_e : \{0, 1, \ldots, m\} \to \mathbb{R}^+$ on each
edge $e \in E$ (where $c_e(l)$ is the cost incurred by the vehicle in traversing edge e while carrying l objects),
find a non-preemptive tour (starting & ending at r) of minimum total cost that moves each object i from $s_i$
to $t_i$.

Note that the classical Dial-a-Ride problem is a special case when the edge costs are given by: $c_e(l) = d_e$
if $l \le k$ & $c_e(l) = \infty$ otherwise, where $d_e$ is the edge length in the underlying metric. We may assume
(without loss of generality) that for any fixed value $l \in [0, m]$, the edge costs $c_e(l)$ induce a metric on V.
Similar to Theorem 7, we have a near optimal solution with a 'batch' structure for the non-uniform
Dial-a-Ride problem as well, which implies the algorithm in Theorem 12. The proof of the following corollary is
almost identical to that of Theorem 7, and is omitted.
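As a small illustration of the special case above, the classical problem can be written as a non-uniform cost function. The following is only a sketch; the helper names `classical_edge_cost` and `is_non_decreasing` are hypothetical and not part of the paper.

```python
import math


def classical_edge_cost(d_e: float, k: int):
    """Cost function c_e of the classical problem: d_e while the load is at most k, infinite otherwise."""
    def c_e(load: int) -> float:
        return d_e if load <= k else math.inf
    return c_e


def is_non_decreasing(c_e, m: int) -> bool:
    """Check the monotonicity required by Definition 10 on loads 0..m."""
    values = [c_e(load) for load in range(m + 1)]
    return all(a <= b for a, b in zip(values, values[1:]))


# Hypothetical edge of length 3.0 and vehicle capacity k = 2.
c = classical_edge_cost(d_e=3.0, k=2)
costs = [c(load) for load in range(5)]
```

Here `costs` is finite for loads 0, 1, 2 and infinite beyond the capacity, so any tour of finite cost respects the capacity constraint.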

Corollary 11 (Non-uniform Structure Theorem) Given any instance of non-uniform Dial-a-Ride, there
exists a feasible tour $\tau$ satisfying the following conditions:

1. $\tau$ can be split into a set of segments $\{S_1, \ldots, S_t\}$ (i.e., $\tau = S_1 \cdot S_2 \cdots S_t$) where each segment $S_i$
services a set $O_i$ of demands such that $S_i$ is a path that first picks up each demand in $O_i$ and then
drops each of them.

2. The cost of $\tau$ is at most $O(\log m)$ times the cost of an optimal tour.

Theorem 12 (Approximating non-uniform Dial-a-Ride) A $\rho$-approximation algorithm for minimum
ratio k-forest implies an $O(\rho \log^2 m)$-approximation algorithm for non-uniform Dial-a-Ride. In particular,
there is an $O(\sqrt{n} \log^2 m)$-approximation algorithm.

Proof: Corollary 11 again suggests a greedy algorithm for non-uniform Dial-a-Ride based on the following
greedy subproblem: find a set T of uncovered demands and a path $\tau_0$ that first picks up each object in T
and then drops off each of them, such that the ratio of the cost of $\tau_0$ to $|T|$ is minimized. However, unlike
in the classical Dial-a-Ride problem, in this case the cost of path $\tau_0$ does not come from a single metric.
Nevertheless, the minimum ratio k-forest problem can be used to solve this subproblem as follows.

1. For every $k = 1, \ldots, m$:

(a) Define length function $d^{(k)}_e = c_e(k)$ on the edges.

(b) Solve the minimum ratio k-forest problem on metric $(V, d^{(k)})$ with target k, to obtain tree $T'_k$ covering
$n_k \le k$ demands.

(c) Obtain an Euler tour $T_k$ of $T'_k$ that services these demands, by picking up all demands in one traversal
and then dropping them all in a second traversal.

2. Return the tour $T_k$ having the smallest ratio $c(T_k)/n_k$ (over all $1 \le k \le m$).
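The loop over targets can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `min_ratio_k_forest` is a hypothetical oracle (no such routine is specified here), and the sketch ranks candidates by the upper bound $4 \cdot d^{(k)}(T'_k)/n_k$ on the tour ratio instead of constructing the Euler tour itself.

```python
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Edge = Hashable
# Hypothetical oracle signature (an assumption of this sketch): given edge
# lengths and a target k, return (tree_edges, n_k) with 1 <= n_k <= k.
Oracle = Callable[[Dict[Edge, float], int], Tuple[List[Edge], int]]


def greedy_subproblem(
    cost: Callable[[Edge, int], float],
    edges: List[Edge],
    m: int,
    min_ratio_k_forest: Oracle,
) -> Optional[Tuple[float, int, List[Edge]]]:
    """Return (ratio bound, k, tree) minimizing 4 * d_k(tree) / n_k over all targets k."""
    best = None
    for k in range(1, m + 1):
        lengths = {e: cost(e, k) for e in edges}      # step 1(a): d_e = c_e(k)
        tree, n_k = min_ratio_k_forest(lengths, k)     # step 1(b): oracle call
        # Step 1(c): the pick-up traversal and the drop-off traversal each cross
        # every tree edge twice, and the load never exceeds n_k <= k.
        bound = 4.0 * sum(lengths[e] for e in tree)
        ratio = bound / n_k
        if best is None or ratio < best[0]:
            best = (ratio, k, tree)
    return best


# Toy usage with a stand-in oracle that returns every edge and covers k demands.
base = {"a": 1.0, "b": 2.0}
toy_cost = lambda e, load: base[e] * (1 + load)
toy_oracle = lambda lengths, k: (list(lengths), k)
best = greedy_subproblem(toy_cost, ["a", "b"], 3, toy_oracle)
```

In the toy run the ratio bounds for k = 1, 2, 3 are 24, 18 and 16, so the sketch returns the candidate for k = 3.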

Assuming a $\rho$-approximation algorithm for minimum ratio k-forest (for all values of k), we now show
that the above algorithm obtains a $16\rho$-approximate solution to the greedy subproblem. The cost of tour
$T_k$ in step 1(c) is $c(T_k) \le 4 \cdot d^{(k)}(T'_k)$, since $T_k$ involves traversing a tour on tree $T'_k$ twice and the
vehicle carries at most $n_k \le k$ objects at every point in $T_k$. So the ratio of tour $T_k$ is at most
$4 \cdot d^{(k)}(T'_k)/n_k = 4 \cdot \mathrm{ratio}(T'_k)$.
Let $\tau$ denote the optimal path for the greedy subproblem, T the set of demands that it services, and $t = |T|$.
Let $T_1$ denote the last $\frac{3}{4}t$ demands that are picked up, and $T_2$ denote the first $\frac{3}{4}t$ demands that are dropped
off. It is clear that $T' = T_1 \cap T_2$ has at least $t/2$ demands; let $T'' \subseteq T'$ be any subset with $|T''| = t/4$.
Let $\tau'$ denote the portion of $\tau$ between the $\frac{t}{4}$-th pick up and the $\frac{3t}{4}$-th drop off. Note that when path $\tau$
is traversed, there are at least $\frac{t}{4}$ objects in the vehicle while traversing each edge in $\tau'$. So the cost of $\tau$ is
$c(\tau) \ge \sum_{e \in \tau'} c_e(t/4)$. Since $\tau'$ contains the end points of all demands in $T'' \subseteq T'$, it is a feasible solution
(covering the demands $T''$) to minimum ratio k-forest with target $k = t/4$ in the metric $d^{(t/4)}$, having ratio
$\left(\sum_{e \in \tau'} c_e(t/4)\right)/(t/4) \le 4c(\tau)/t$. So the ratio of tour $T_{t/4}$ (obtained from the $\rho$-approximate tree $T'_{t/4}$) is at most
$4 \cdot \mathrm{ratio}(T'_{t/4}) \le 4\rho \cdot \frac{4c(\tau)}{t} = 16\rho \cdot \frac{c(\tau)}{t}$. Thus we have a $16\rho$-approximation algorithm for the greedy subproblem.

Based on Corollary 11, it can now be shown (as in Theorem 9) that a $\rho'$-approximation algorithm for the
greedy subproblem implies an $O(\rho' \cdot \log^2 m)$-approximation algorithm for non-uniform Dial-a-Ride. Using
the above $16\rho$-approximation for the greedy subproblem, we have the theorem. ■

3.3 Weighted Dial-a-Ride 

So far we worked with the unweighted version of Dial-a-Ride, where each object has the same weight. In 
this section, we extend our greedy framework for Dial-a-Ride to the case when objects have different sizes, 
and the total size of objects in the vehicle must be bounded by the vehicle capacity. Here we only extend the 
classical Dial-a-Ride problem and not the generalization of Section 3.2. The problem studied in this section
has been studied earlier as the pickup and delivery problem [SS95].

Definition 13 (Weighted Dial-a-Ride) Given a vehicle of capacity $Q \in \mathbb{N}$, an n-vertex metric space $(V, d)$,
a root vertex r, and a set of m objects $\{(s_i, t_i, w_i)\}_{i=1}^{m}$ (with object i having source $s_i$, destination $t_i$ & an
integer size $1 \le w_i \le Q$), find a minimum length (non-preemptive) tour of the vehicle starting (and ending)
at r that moves each object i from its source to its destination such that the total size of objects carried by
the vehicle is at most Q at any point on the tour.

The classical Dial-a-Ride problem is a special case when $w_i = 1$ for all demands and the vehicle
capacity $Q = k$. The following are two lower bounds for weighted Dial-a-Ride: a TSP tour on the set of
all sources & destinations (Steiner lower bound); and $\sum_{i=1}^{m} \frac{w_i \cdot d(s_i, t_i)}{Q}$ (flow lower bound). In fact, as can be
seen easily, these two lower bounds are valid even for the preemptive version of weighted Dial-a-Ride; so 
they are termed preemptive lower bounds. 
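The flow lower bound is straightforward to evaluate. The sketch below uses a hypothetical toy instance on the real line (points are numbers and the distance is the absolute difference); the function name `flow_lower_bound` is illustrative only.

```python
def flow_lower_bound(objects, dist, Q):
    """Sum of w_i * d(s_i, t_i) over all objects, divided by the capacity Q."""
    return sum(w * dist(s, t) for s, t, w in objects) / Q


# Hypothetical instance: objects are (source, destination, size) triples on a line.
objects = [(0, 4, 2), (1, 3, 3)]
bound = flow_lower_bound(objects, lambda a, b: abs(a - b), Q=4)
```

For this instance the bound is (2·4 + 3·2)/4 = 3.5.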

The main result of this section (Theorem 15) reduces weighted Dial-a-Ride to the classical
Dial-a-Ride problem with the additional property that the number of demands (m) is small (polynomial in the
number of vertices n). This shows that in order to approximate weighted Dial-a-Ride, it suffices to consider
instances of the classical Dial-a-Ride problem with a small number of demands. The next lemma shows
that even if the vehicle is allowed to split each object over multiple deliveries, the resulting tour is not
much shorter than the tour where each object is required to be served in a single delivery (as is the case in
weighted Dial-a-Ride). This lemma is the main ingredient in the proof of Theorem 15. In the following, for
any instance of weighted Dial-a-Ride, we define the unweighted instance corresponding to it as a classical
Dial-a-Ride instance with vehicle capacity Q, and $w_i$ (unweighted) demands each having source $s_i$ and
destination $t_i$ (for each $1 \le i \le m$).
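The corresponding unweighted instance can be produced by simple replication. This is a sketch with a hypothetical helper name; note that the expansion has size proportional to the total weight, which may be much larger than the weighted input.

```python
def unweighted_instance(objects, Q):
    """Replace each weighted object (s, t, w) by w unit demands (s, t); the capacity Q is unchanged."""
    demands = [(s, t) for (s, t, w) in objects for _ in range(w)]
    return demands, Q


# Hypothetical weighted instance with two objects of sizes 2 and 1.
demands, capacity = unweighted_instance([("a", "b", 2), ("c", "d", 1)], Q=3)
```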

Lemma 14 Given any instance $\mathcal{I}$ of weighted Dial-a-Ride, and a solution $\tau$ to the unweighted instance
corresponding to $\mathcal{I}$, there is a polynomial time computable solution to $\mathcal{I}$ having length at most $O(1) \cdot d(\tau)$.

Proof: Let $\mathcal{J}$ denote the unweighted instance corresponding to $\mathcal{I}$. Define line $\mathcal{L}$ as in the proof of
Theorem 7 by traversing $\tau$ from r: for every edge traversal in $\tau$, add a new edge of the same length at the end of
$\mathcal{L}$. For each unweighted object in $\mathcal{J}$ corresponding to demand i in $\mathcal{I}$, there is a segment in $\tau$
(correspondingly in $\mathcal{L}$) where it is moved from $s_i$ to $t_i$. So each demand $i \in \mathcal{I}$ corresponds to $w_i$ segments in $\tau$ (each
being a path from $s_i$ to $t_i$). For each demand i in $\mathcal{I}$, we assign i to one of its $w_i$ segments picked uniformly
at random: call this segment $\ell_i$. For an edge $e \in \mathcal{L}$, let $N_e = \sum_{i : e \in \ell_i} w_i$ denote the random variable which
equals the total weight of demands whose assigned segments contain e. Note that the expected value of $N_e$
is exactly the number of unweighted objects carried by $\tau$ when traversing the edge corresponding to e. Since
$\tau$ is a feasible tour for $\mathcal{J}$, $E[N_e] \le Q$ for all $e \in \mathcal{L}$.

Consider a random instance $\mathcal{R}$ of Dial-a-Ride on line $\mathcal{L}$ with vehicle capacity Q and demands as
follows: for each demand i in $\mathcal{I}$, an object of weight $w_i$ is to be moved along segment $\ell_i$ (chosen randomly as
above). Clearly, any feasible tour for $\mathcal{R}$ corresponds to a feasible tour for $\mathcal{I}$ of the same length. Note that the
flow lower bound for instance $\mathcal{R}$ is $F = \sum_{e \in \mathcal{L}} d_e \lceil N_e / Q \rceil$, and the Steiner lower bound is $\sum_{e \in \mathcal{L}} d_e = d(\tau)$.

Using linearity of expectation, $E[F] \le \sum_{e \in \mathcal{L}} d_e \left(\frac{E[N_e]}{Q} + 1\right) \le 2 \cdot d(\tau)$. Let $\mathcal{R}^*$ denote the instance (on
line $\mathcal{L}$) obtained by assigning each demand i in $\mathcal{I}$ to its shortest length segment (among the $w_i$ segments
corresponding to it). Clearly this assignment minimizes the flow lower bound (over all assignments of
demands to segments). So $\mathcal{R}^*$ has flow bound at most $E[F] \le 2 \cdot d(\tau)$, and Steiner lower bound $d(\tau)$.
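The quantities in this argument can be made concrete with a small sketch. The representation is an assumption made for illustration: the line is a list of edge lengths, and a segment `(a, b)` covers edges `a, ..., b-1`. The helper names are hypothetical.

```python
import math


def flow_bound_on_line(edge_len, assigned, Q):
    """F = sum over edges e of d_e * ceil(N_e / Q), where N_e is the weight assigned across e."""
    load = [0] * len(edge_len)
    for (a, b), w in assigned:
        for e in range(a, b):
            load[e] += w
    return sum(d * math.ceil(n / Q) for d, n in zip(edge_len, load))


def shortest_segment_assignment(edge_len, candidates):
    """For each demand (segments, w), keep its shortest candidate segment."""
    def seg_len(seg):
        a, b = seg
        return sum(edge_len[a:b])
    return [(min(segs, key=seg_len), w) for segs, w in candidates]


# Hypothetical line with four edges; each demand of weight 2 has two candidate segments.
edge_len = [1, 2, 1, 3]
candidates = [([(0, 2), (2, 3)], 2), ([(1, 4), (0, 1)], 2)]
shortest = shortest_segment_assignment(edge_len, candidates)
F_short = flow_bound_on_line(edge_len, shortest, Q=3)
F_long = flow_bound_on_line(edge_len, [((0, 2), 2), ((1, 4), 2)], Q=3)
```

On this toy input the shortest-segment assignment gives a flow bound of 2, while assigning each demand to its longer segment gives 9.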

Finally, we note that the 3-approximation algorithm for Dial-a-Ride on a line [KRW00] extends to
a constant factor approximation algorithm for the case with weighted demands as well (this can be seen
directly from [KRW00]). Additionally, this approximation guarantee is relative to the preemptive lower
bounds. Thus, using this algorithm on $\mathcal{R}^*$, we obtain a feasible solution to $\mathcal{I}$ of length at most $O(1) \cdot d(\tau)$. ■

Theorem 15 (Weighted Dial-a-Ride to unweighted) Suppose there is a $\rho$-approximation algorithm for
instances of classical Dial-a-Ride with at most $O(n^4)$ demands. Then there is an $O(\rho)$-approximation
algorithm for weighted Dial-a-Ride (with any number of demands). In particular, there is an $O(\sqrt{n} \log^2 n)$
approximation for weighted Dial-a-Ride.

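Proof: Let $\mathcal{I}$ denote an instance of weighted Dial-a-Ride with objects $\{(w_i, s_i, t_i) : 1 \le i \le m\}$, and $\tau^*$ an
optimal tour for $\mathcal{I}$. Let $\mathcal{P} = \{(s_1, t_1), \ldots, (s_l, t_l)\}$ be the distinct pairs of vertices that have some demand
between them, and let $T_i$ denote the total size of all objects having source $s_i$ and destination $t_i$. Note that
$l \le n(n-1)$. Let $\mathcal{P}_{high} = \{i \in \mathcal{P} : T_i \ge \frac{Q}{2}\}$, $\mathcal{P}_{low} = \{i \in \mathcal{P} : T_i \le \frac{Q}{l}\}$, and $\mathcal{P}' = \mathcal{P} \setminus (\mathcal{P}_{high} \cup \mathcal{P}_{low})$.
We now show how to separately service objects in $\mathcal{P}_{low}$, $\mathcal{P}_{high}$ & $\mathcal{P}'$.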

Servicing $\mathcal{P}_{low}$: The total size in $\mathcal{P}_{low}$ is at most Q; so we can service all these pairs by traversing a
single 1.5-approximate tour [Chr77] on the sources and destinations. Note that the length of this tour is at
most 1.5 times the Steiner lower bound, hence at most $1.5 \cdot d(\tau^*)$.

Servicing $\mathcal{P}_{high}$: Let C be a 1.5-approximate minimum tour on all the sources. The pairs in $\mathcal{P}_{high}$ are
serviced by a tour $\pi$ as follows. Traverse along C, and when a source $s_i$ in $\mathcal{P}_{high}$ is visited, traverse the
direct edge to the corresponding destination $t_i$ & back, as few times as possible so as to move all the objects
between $s_i$ and $t_i$, as described next. Note that every object to be moved between $s_i$ and $t_i$ has size (the
original $w_i$ size) at most Q, and the total size of such objects $T_i \ge Q/2$. So these objects can be partitioned
such that the size of each part (except possibly the last) is in the interval $[\frac{Q}{2}, Q]$. So the number of times
edge $(s_i, t_i)$ is traversed to service the demands between them is at most $2\lceil \frac{2T_i}{Q} \rceil \le 2(\frac{2T_i}{Q} + 1) \le 8\frac{T_i}{Q}$. Now,
the length of tour $\pi$ is at most $d(C) + \sum_{(s_i, t_i) \in \mathcal{P}_{high}} 8\, d(s_i, t_i) \frac{T_i}{Q} \le d(C) + 8 \sum_{i=1}^{m} \frac{w_i \cdot d(s_i, t_i)}{Q}$. Note that
d(C) is at most 1.5 times the minimum tour on all sources (Steiner lower bound), and the second term above
is the flow lower bound. So tour $\pi$ has length at most O(1) times the preemptive lower bounds for $\mathcal{I}$, which
is at most $O(1) \cdot d(\tau^*)$.
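One way to realize the partition used above is a single greedy pass. The sketch below is an assumed procedure (the text does not spell one out): objects larger than Q/2 travel alone, and smaller objects are accumulated until a batch reaches Q/2.

```python
def partition_into_batches(sizes, Q):
    """Split sizes (each at most Q) into batches of total size in [Q/2, Q], except possibly the last."""
    batches, current, load = [], [], 0
    for w in sizes:
        if 2 * w > Q:                 # a large object forms its own batch
            batches.append([w])
            continue
        current.append(w)             # small object: load was < Q/2, so it stays <= Q
        load += w
        if 2 * load >= Q:             # batch reached Q/2: close it
            batches.append(current)
            current, load = [], 0
    if current:                       # leftover light batch goes last
        batches.append(current)
    return batches


# Hypothetical sizes between one source-destination pair, with capacity Q = 6.
batches = partition_into_batches([5, 1, 2, 3, 1], Q=6)
```

Here the total size is 12 and the sketch produces 4 batches, matching the bound of $\lceil 2T_i/Q \rceil$ round trips.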

Servicing $\mathcal{P}'$: We know that the total size $T_i$ of each pair i in $\mathcal{P}'$ lies in the interval $(Q/l, Q/2)$. Let $\mathcal{I}'$
denote the instance of weighted Dial-a-Ride with demands $\{(s_i, t_i, T_i) : i \in \mathcal{P}'\}$ and vehicle capacity Q;
note that the number of demands in $\mathcal{I}'$ is at most l. The tour $\tau^*$ restricted to the objects corresponding to
pairs in $\mathcal{P}'$ is a feasible solution to the unweighted instance corresponding to $\mathcal{I}'$ (but it may not be feasible for
$\mathcal{I}'$ itself). However Lemma 14 implies that the optimal value of $\mathcal{I}'$ satisfies $opt(\mathcal{I}') \le O(1) \cdot d(\tau^*)$.

Next we reduce instance $\mathcal{I}'$ to an instance $\mathcal{J}$ of weighted Dial-a-Ride satisfying the following conditions:
(i) $\mathcal{J}$ has at most l demands, (ii) each object in $\mathcal{J}$ has size at most 2l, (iii) any feasible solution to $\mathcal{J}$ is feasible
for $\mathcal{I}'$, and (iv) the optimal value $opt(\mathcal{J}) \le O(1) \cdot opt(\mathcal{I}')$. If $Q \le 2l$, $\mathcal{J} = \mathcal{I}'$ itself satisfies the required
conditions. Suppose $Q > 2l$, then define $p = \lfloor \frac{Q}{l} \rfloor$; note that $Q \ge l \cdot p \ge Q - l \ge \frac{Q}{2}$. Round up each
size $T_i$ to the smallest integral multiple $T'_i$ of p, and round down the capacity Q to $Q' = l \cdot p$. Since each
size $T_i \in (\frac{Q}{l}, \frac{Q}{2})$, all sizes $T'_i \in \{p, 2p, \ldots, lp\}$. Now let $\mathcal{I}''$ denote the weighted Dial-a-Ride instance with
demands $\{(s_i, t_i, T'_i) : i \in \mathcal{P}'\}$ and vehicle capacity $Q' = lp$. One can obtain a feasible solution for $\mathcal{I}''$
from any feasible solution $\sigma$ for $\mathcal{I}'$ by traversing $\sigma$ a constant number of times: this follows from $Q' \ge \frac{Q}{2}$
& $T'_i \le \max\{2T_i, Q'\}$.³ So the optimal value of $\mathcal{I}''$ is at most $O(1) \cdot opt(\mathcal{I}')$. Now note that all sizes and
the vehicle capacity in $\mathcal{I}''$ are multiples of p; scaling down each of these quantities by p, we get an instance
$\mathcal{J}$ equivalent to $\mathcal{I}''$ where the vehicle capacity is l (and every demand size is at most l). This instance $\mathcal{J}$
satisfies all the four conditions claimed above.
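The rounding and scaling step can be sketched in integer arithmetic as follows. The function name and the numeric instance are hypothetical; the sketch assumes $Q > 2l$ and that every size lies strictly between $Q/l$ and $Q/2$, as in the case treated above.

```python
def round_and_scale(sizes, Q, l):
    """Round sizes up to multiples of p = floor(Q / l), round the capacity down to l * p, then divide by p."""
    p = Q // l                                   # p = floor(Q / l)
    Q_prime = l * p                              # rounded-down capacity
    rounded = [-(-T // p) * p for T in sizes]    # smallest multiple of p that is >= T
    scaled = [T // p for T in rounded]           # sizes after dividing by p
    return p, Q_prime, rounded, scaled, Q_prime // p


# Hypothetical instance: capacity 100, l = 7 pairs, sizes in (100/7, 50).
sizes = [15, 30, 49]
p, Q_prime, rounded, scaled, cap = round_and_scale(sizes, Q=100, l=7)
```

For this input p = 14, the capacity becomes 98, the rounded sizes are 28, 42, 56, and after scaling the instance has capacity 7 and sizes 2, 3, 4.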

Now observe that the instance $\mathcal{J}$ can be solved using the $\rho$-approximation algorithm assumed in the
theorem, as follows. Since $\mathcal{J}$ has at most l demands (each of size at most 2l), the unweighted instance
corresponding to $\mathcal{J}$ has at most $2l^2 \le 2n^4$ demands. Thus, this unweighted instance can be solved using the
$\rho$-approximation algorithm for such instances, assumed in the theorem. Then using the algorithm in Lemma 14, we obtain a
solution to $\mathcal{J}$ of length at most $O(\rho) \cdot opt(\mathcal{J}) \le O(\rho) \cdot opt(\mathcal{I}') \le O(\rho) \cdot d(\tau^*)$. Since any feasible solution
to $\mathcal{J}$ corresponds to one for $\mathcal{I}'$, we have a tour servicing $\mathcal{P}'$ of length at most $O(\rho) \cdot d(\tau^*)$.

Finally, combining the tours servicing $\mathcal{P}_{low}$, $\mathcal{P}_{high}$ & $\mathcal{P}'$, we obtain a feasible tour for $\mathcal{I}$ having length
$O(\rho) \cdot d(\tau^*)$, which gives us the desired approximation algorithm. ■

³ In particular, consider simulating a traversal along $\sigma$ of a capacity Q vehicle ($T_0$) by 8 capacity $Q'$ vehicles $T_1, \ldots, T_8$, each
running in parallel along $\sigma$. Whenever vehicle $T_0$ picks up an object i, one of the vehicles $T_1, \ldots, T_8$ picks up i: if object i is
small (its size is at most a fixed fraction of the capacity), any of the vehicles $T_1, \ldots, T_4$ that has free capacity picks up i;
otherwise, any of the vehicles $T_5, \ldots, T_8$ that is empty picks up i. It is easy to see that if at some point none of the vehicles
$T_1, \ldots, T_8$ picks up an object, there must be a capacity violation in $T_0$.

Theorem 15 also justifies the assumption $\log m = O(\log n)$ made at the end of Section 3.1. This is important
because in general m may be super-polynomial in n.

4 The Effect of Preemptions 

In this section, we study the effect of the number of preemptions in the Dial-a-Ride problem. We mentioned
two versions of the Dial-a-Ride problem (Definition 3): in the preemptive version, an object may be
preempted any number of times, and in the non-preemptive version objects are not allowed to be preempted
even once. Clearly the preemptive version is least restrictive and the non-preemptive version is most
restrictive. One may consider other versions of the Dial-a-Ride problem, where there is a specified upper bound P
on the number of times an object can be preempted. Note that the case P = 0 is the non-preemptive version,
and the case P = n is the preemptive version. We show that for any instance of the Dial-a-Ride problem,
there is a tour that preempts each object at most once (i.e., P = 1) and has length at most $O(\log^2 n)$
times an optimal preemptive tour (i.e., P = n). This implies that the real gap between preemptive and
non-preemptive tours is between zero and one preemption per object. A tour that preempts each object at
most once is called a 1-preemptive tour.

Theorem 16 (Many preemptions to one preemption) Given any instance of the Dial-a-Ride problem, there
is a 1-preemptive tour of length at most $O(\log^2 n) \cdot \mathrm{OPT}_{pmt}$, where $\mathrm{OPT}_{pmt}$ is the length of an optimal
preemptive tour. Such a tour can be found in randomized polynomial time.

Proof: Using the results on probabilistic tree embedding [FRT03], we may assume that the given metric is
a hierarchically well-separated tree T. This only increases the expected length of the optimal solution by
a factor of $O(\log n)$. Further, tree T has $O(\log \frac{d_{max}}{d_{min}})$ levels, where $d_{max}$ and $d_{min}$ denote the maximum
and minimum distances in the original metric. We first observe that using standard scaling arguments, it
suffices to assume that $\frac{d_{max}}{d_{min}}$ is polynomial in n. Without loss of generality, any preemptive tour involves at
most $2m \cdot n$ edge traversals: each object is picked or dropped at most 2n times (once at each vertex), and
every visit to a vertex involves picking or dropping at least one object (otherwise the tour can be shortcut
over this vertex at no increase in length). By retaining only vertices within distance $\mathrm{OPT}_{pmt}/2$ from the
root r, we preserve the optimal preemptive tour and ensure that $d_{max} \le \mathrm{OPT}_{pmt}$. Now consider modifying
the original metric by setting all edges of length smaller than $\mathrm{OPT}_{pmt}/(2mn^3)$ to length 0; the new distances
are shortest paths under the modified edge lengths. So any pairwise distance decreases by at most $\frac{\mathrm{OPT}_{pmt}}{2mn^2}$.
Clearly the length of the optimal preemptive tour only decreases under this modification. Since there are at
most 2mn edge traversals in any preemptive tour, the increase in tour length in going from the new metric to
the original metric is at most $2mn \cdot \frac{\mathrm{OPT}_{pmt}}{2mn^2} = \frac{\mathrm{OPT}_{pmt}}{n}$. Thus at the loss of a constant factor, we may assume
that $d_{max}/d_{min} \le 2mn^3$. Further, the reduction in Theorem 15 also holds for preemptive Dial-a-Ride; so
we may assume (at the loss of an additional constant factor) that the number of demands $m \le O(n^4)$. So
we have $d_{max}/d_{min} \le O(n^7)$ and hence tree T has $O(\log n)$ levels.

The tree T resulting from the probabilistic embedding has several Steiner vertices that are not present in
the original metric; so the tour that we find on T may actually preempt objects at Steiner vertices, in which
case it is not feasible in the original metric. However as shown by Gupta [Gup01], these Steiner vertices
can be simulated by vertices in the original metric (at the loss of a constant factor). Based on the preceding
observations, we assume that the metric is a tree T on the original vertex set having $l = O(\log n)$ levels,
such that the expected length of the optimal preemptive tour is $O(\log n) \cdot \mathrm{OPT}_{pmt}$.

We now partition the demands in T into l sets with $D_i$ (for $i = 1, \ldots, l$) consisting of all demands
having their least common ancestor (lca) in level i. We service each $D_i$ separately using a tour of length
$O(\mathrm{OPT}_{pmt})$. Then concatenating the tours for each level i, we obtain the theorem.
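The partition of demands by the level of their least common ancestor can be sketched as below. The tree representation (a `parent` map and a `depth` map, with depth counted from the root) is an assumption of this sketch, and the level numbering may differ from the convention of the embedding.

```python
def lca(parent, depth, u, v):
    """Least common ancestor of u and v in a rooted tree given by parent pointers and depths."""
    while depth[u] > depth[v]:
        u = parent[u]
    while depth[v] > depth[u]:
        v = parent[v]
    while u != v:
        u, v = parent[u], parent[v]
    return u


def group_by_lca_level(parent, depth, demands):
    """Map level -> {vertex v -> demands whose lca is v}, i.e. the sets D_i split into the sets L_v."""
    groups = {}
    for s, t in demands:
        a = lca(parent, depth, s, t)
        groups.setdefault(depth[a], {}).setdefault(a, []).append((s, t))
    return groups


# Hypothetical tree: root r with children a, b; a has leaves x, y; b has leaf z.
parent = {"r": None, "a": "r", "b": "r", "x": "a", "y": "a", "z": "b"}
depth = {"r": 0, "a": 1, "b": 1, "x": 2, "y": 2, "z": 2}
groups = group_by_lca_level(parent, depth, [("x", "y"), ("x", "z"), ("y", "x")])
```

Demands grouped under different vertices of the same level lie in disjoint subtrees, which is the property the proof relies on when it services each group separately.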

Servicing $D_i$. For each vertex v at level i in T, let $L_v$ denote the demands in $D_i$ that have v as their
lca. Consider an optimal preemptive tour that services the demands $D_i$. Since the subtrees under any
two different level i vertices are disjoint and there is no demand in $D_i$ across such subtrees, we may
assume that this optimal tour is a concatenation of disjoint preemptive tours servicing each $L_v$ separately. If
$\mathrm{OPT}_{pmt}(v)$ denotes the length of an optimal preemptive tour servicing $L_v$ with v as the starting vertex,
$\sum_v \mathrm{OPT}_{pmt}(v) \le \mathrm{OPT}_{pmt}$.

Now consider an optimal preemptive tour $\tau_v$ servicing $L_v$. Since the $s_j$ - $t_j$ path of each demand $j \in L_v$
crosses vertex v, at some point in tour $\tau_v$ the vehicle is at v with object j in it. Consider the tour $\sigma_v$ obtained
by modifying $\tau_v$ so that it drops each object j at v when the vehicle is at v with object j in it. Clearly
$d(\sigma_v) = d(\tau_v) = \mathrm{OPT}_{pmt}(v)$. Note that $\sigma_v$ is a feasible preemptive tour for the single source
Dial-a-Ride problem with sink v and all sources in $L_v$. Thus the algorithm of [HK85] gives a non-preemptive tour
$\sigma'_v$ that moves all objects in $L_v$ from their sources to v, having length at most $2.5\, d(\sigma_v) = 2.5\, \mathrm{OPT}_{pmt}(v)$.
Similarly, we can obtain a non-preemptive tour $\sigma''_v$ that moves all objects in $L_v$ from v to their destinations,
having length at most $2.5\, \mathrm{OPT}_{pmt}(v)$. Now $\sigma'_v \cdot \sigma''_v$ is a 1-preemptive tour servicing $L_v$ of length at most
$5 \cdot \mathrm{OPT}_{pmt}(v)$.

We now run a DFS on T to visit all vertices in level i, and use the algorithm described above for
servicing demands $L_v$ when v is visited in the DFS. This results in a tour servicing $D_i$, having length at
most $2d(T) + 5\sum_v \mathrm{OPT}_{pmt}(v)$. Here $2d(T)$ is the Steiner lower bound, and $\sum_v \mathrm{OPT}_{pmt}(v) \le \mathrm{OPT}_{pmt}$.
Thus the tour servicing $D_i$ has length at most $6 \cdot \mathrm{OPT}_{pmt}$.

Finally concatenating the tours for each level $i = 1, \ldots, l$, we obtain a 1-preemptive tour on T of
length $O(\log n) \cdot \mathrm{OPT}_{pmt}$, which translates to a 1-preemptive tour on the original metric having length
$O(\log^2 n) \cdot \mathrm{OPT}_{pmt}$. ■

Motivated by obtaining an improved approximation for Dial-a-Ride on the Euclidean plane, we next 
consider the worst case gap between an optimal non-preemptive tour and the preemptive lower bounds. As 
mentioned earlier, [CR98] showed that there are instances of Dial-a-Ride where the ratio of the optimal
non-preemptive tour to the optimal preemptive tour is $\Omega(n^{1/3})$. However, the metric involved in this example
was the uniform metric on n points, which cannot be embedded in the Euclidean plane. The following
theorem shows that even in this special case, there can be a polynomial gap between non-preemptive and 
preemptive tours, and implies that just preemptive lower bounds do not suffice to obtain a poly-logarithmic 
approximation guarantee. 

Theorem 17 (Preemption gap in Euclidean plane) There are instances of Dial-a-Ride on the Euclidean
plane where the optimal non-preemptive tour has length $\Omega\left(\frac{n^{1/8}}{\log^3 n}\right)$ times the optimal preemptive tour.

Proof: Consider a square of side 1 in the Euclidean plane, in which a set of n demand pairs are distributed
uniformly at random (each demand point is generated independently and is distributed uniformly at random
in the square). The vehicle capacity is set to $k = \sqrt{n}$. Let $\mathcal{R}$ denote a random instance of Dial-a-Ride
obtained as above. We show that in this case, the optimal non-preemptive tour has length $\tilde{\Omega}(n^{1/8})$ times the
optimal preemptive tour, with high probability. We first show the following claim.

Claim 18 The minimum length of a tree containing k pairs in $\mathcal{R}$ is $\Omega\left(\frac{n^{1/8}}{\log n}\right)$, w.h.p.

Proof: Take any set S of $k = \sqrt{n}$ demand pairs. Note that the number of such sets S is $\binom{n}{k}$. This set S has
2k points each of them generated uniformly at random. It is known that there are $p^{p-2}$ different labeled trees
on p vertices (see e.g. [vLW92], Ch. 2). The term labeled emphasizes that we are not identifying isomorphic
graphs, i.e., two trees are counted as the same if and only if exactly the same pairs of vertices are adjacent.
Thus there are at most $(2k)^{2k-2}$ such trees just on set S. Consider any tree T among these trees and root it
at the source point with minimum label. Here we assume that T has been generated using the "Principle of
Deferred Decisions", i.e., nodes will be generated one by one according to some breadth-first ordering of T.
We say that an edge is short if its length is at most $\frac{c}{\alpha k}$ (c and $\alpha \in (0, \frac{1}{4})$ will be fixed later).

If T has length at most c, it is clear that at most an $\alpha$ fraction of its edges are not short. So
$\Pr[\mathrm{length}(T) \le c] \le \sum_{H} \Pr[\text{edges in } H \text{ are short}]$, where H in the summation ranges over all edge-subsets in T
with $|H| \ge (1 - \alpha)2k$. For a fixed H, we bound Pr[edges in H are short] as follows. For any edge
(v, parent(v)) (note parent(v) is well-defined since T is rooted), assuming that parent(v) is fixed, the
probability that this edge is short is $p = \pi \left(\frac{c}{\alpha k}\right)^2$. So we can upper bound the probability that edges H are short
by $p^{|H|} \le p^{(1-\alpha)2k}$. So we have $\Pr[\mathrm{length}(T) \le c] \le 2^{2k} \cdot p^{(1-\alpha)2k}$, as the number of different edge sets
H is at most $2^{2k}$.

By a union bound over all such labeled trees T, the probability that the length of the minimum spanning
tree on S is less than c is at most $(2k)^{2k} \cdot 2^{2k} \cdot p^{(1-\alpha)2k}$. Now taking a union bound over all k-sets S, the
probability that the minimum length of a tree containing k pairs is less than c is at most $\binom{n}{k} (2k)^{2k} 2^{2k} p^{(1-\alpha)2k}$.
Since $k = \sqrt{n}$, this term can be bounded as follows:

$$(ek)^k (4k)^{2k} \pi^{2k} \left(\frac{c}{\alpha k}\right)^{4(1-\alpha)k} \le 500^k\, k^{3k} \left(\frac{c}{\alpha k}\right)^{4(1-\alpha)k} = \left[500 \cdot k^3 \left(\frac{c}{\alpha k}\right)^{4(1-\alpha)}\right]^k \le 2^{-k}$$

The last inequality above holds when $c \le \frac{\alpha}{1000} \cdot k^{1/4 - 3\alpha/(4(1-\alpha))}$. Setting $\alpha = \frac{1}{\log k}$, we get

$$\Pr\left[\exists \text{ tree of length at most } \frac{k^{1/4}}{8000 \cdot \log k} \text{ containing } k \text{ pairs in } \mathcal{R}\right] \le 2^{-k}$$

So, with probability at least $1 - 2^{-\sqrt{n}}$, the minimum length of a tree containing k pairs in $\mathcal{R}$ is
$\Omega\left(\frac{n^{1/8}}{\log n}\right)$.
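The constants in this calculation can be sanity-checked numerically. The sketch below is illustrative only and rests on assumptions: logarithms are taken base 2, $\alpha = 1/\log k$, $c = k^{1/4}/(8000 \log k)$, and the per-tree quantity is read as $500 \cdot k^3 \cdot (c/(\alpha k))^{4(1-\alpha)}$. The function name is hypothetical.

```python
import math

# Constant collected from (ek)^k (4k)^{2k} pi^{2k} = (16 e pi^2)^k * k^{3k}.
constant = math.e * 16 * math.pi ** 2


def per_tree_base(k: int) -> float:
    """Value of 500 * k^3 * (c / (alpha k))^{4 (1 - alpha)} for the assumed alpha and c."""
    log_k = math.log2(k)                      # assumption: logarithm base 2
    alpha = 1.0 / log_k
    c = k ** 0.25 / (8000.0 * log_k)
    return 500.0 * k ** 3 * (c / (alpha * k)) ** (4.0 * (1.0 - alpha))


# The union bound is at most 2^{-k} whenever this base is at most 1/2.
values = [per_tree_base(2 ** j) for j in (4, 10, 20)]
```

Under these assumptions the collected constant stays below 500 and the base is far below 1/2 for the sampled values of k, consistent with the $2^{-k}$ bound.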

From Theorem 7 we obtain that there is a near optimal non-preemptive tour servicing all the demands
in segments, where each segment (except possibly the last) involves servicing a set of $\frac{k}{2} \le t \le k$ demands.
Although the lower bound of k/2 is not stated in Theorem 7, it is easy to extend the statement to include it.
This implies that any solution of this structure has at least $\frac{n}{k} = k$ segments. Since each segment covers at
least k/2 pairs, Claim 18 implies that each of these segments has length $\Omega(n^{1/8}/\log n)$. So the best solution
of the structure given in Theorem 7 has length $\Omega\left(\frac{n^{1/8}}{\log n} \cdot k\right)$. But since there is a near-optimal solution of this
structure, the optimal non-preemptive tour on $\mathcal{R}$ has length $\Omega\left(\frac{n^{1/8}}{\log^2 n} \cdot k\right)$.

On the other hand, the flow lower bound for $\mathcal{R}$ is at most $\frac{n}{k} = k$, and the Steiner lower bound is at most
$O(\sqrt{n}) = O(k)$ (an $O(\sqrt{n})$ length tree on the 2n points can be constructed using a $\sqrt{2n} \times \sqrt{2n}$ gridding).
So the preemptive lower bounds are both O(k); now using the algorithm of [CR98], we see that the optimal
preemptive tour has length $O(k \log n)$. Combined with the lower bound for non-preemptive tours, we obtain
the Theorem. ■

Acknowledgements: We thank Alan Frieze for his help in proving Theorem 17.